33 research outputs found

    A novel scoring schema for peptide identification by searching protein sequence databases using tandem mass spectrometry data

    BACKGROUND: Tandem mass spectrometry (MS/MS) is a powerful tool for protein identification. Although great efforts have been made in scoring the correlation between tandem mass spectra and an amino acid sequence database, improvements could be made in three aspects: characterization of peaks in spectra, adoption of effective scoring functions, and assessment of the reliability of matches between peptides and spectra. RESULTS: A novel scoring function is presented, along with criteria to estimate the performance confidence of the function. By learning the types of product ions and the probability of generating them, a hypothetical spectrum was generated for each candidate peptide. Relative entropy was then introduced to measure the similarity between the hypothetical and the observed spectra. Based on extreme value distribution (EVD) theory, a threshold was chosen to distinguish a true peptide assignment from a random one. Tests on a public MS/MS dataset demonstrated that this method performs better than the well-known SEQUEST. CONCLUSION: A reliable identification of proteins from the spectra promises a more efficient application of tandem mass spectrometry to proteomes with high complexity.
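The abstract above names relative entropy as the similarity measure between a predicted and an observed spectrum but does not give its exact form. As a minimal sketch, assuming both spectra are already binned onto the same m/z grid and that the divergence is taken from the observed to the predicted spectrum (both assumptions, not details from the paper), the computation could look like this. The function name `relative_entropy` and the smoothing constant `eps` are hypothetical.

```python
import math

def relative_entropy(observed, predicted, eps=1e-9):
    """Kullback-Leibler divergence D(observed || predicted).

    Both inputs are lists of non-negative peak intensities on the same
    m/z bins (an assumption). A small eps avoids log(0) for empty bins.
    """
    if len(observed) != len(predicted):
        raise ValueError("spectra must share the same binning")
    p = [x + eps for x in observed]
    q = [x + eps for x in predicted]
    p_total, q_total = sum(p), sum(q)
    # Normalize intensities so each spectrum is a probability distribution.
    p = [x / p_total for x in p]
    q = [x / q_total for x in q]
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

A lower value means the two spectra are more alike; identical spectra give a value of zero.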

    Bioaccumulation of Hg in rice leaf facilitates selenium bioaccumulation in rice (Oryza sativa L.) leaf in the Wanshan mercury mine

    Mercury (Hg) bioaccumulation in rice poses a health issue for rice consumers. In rice paddies, selenium (Se) can decrease the bioavailability of Hg by forming the less bioavailable Hg selenides (HgSe) in soil. Rice leaves can directly take up a substantial amount of elemental Hg from the atmosphere; however, whether the bioaccumulation of Hg in rice leaves can affect the bioaccumulation of Se in rice plants is not known. Here, we conducted field and controlled studies to investigate the bioaccumulation of Hg and Se in the rice-soil system. In the field study, we observed a significantly positive correlation between Hg concentrations and bioaccumulation factors (BAFs) of Se in rice leaves (r² = 0.60, p < 0.01) collected from the Wanshan Mercury Mine (WMM), SW China, suggesting that the bioaccumulation of atmospheric Hg in rice leaves can facilitate the uptake of soil Se, perhaps through the formation of an Hg-Se complex in rice leaves. This conclusion was supported by the controlled study, which observed significantly higher concentrations and BAFs of Se in rice leaves at a high atmospheric Hg site at WMM, compared to a low atmospheric Hg site in Guiyang, SW China.

    Multi-site, Multi-domain Airway Tree Modeling (ATM'22): A Public Benchmark for Pulmonary Airway Segmentation

    Open international challenges are becoming the de facto standard for assessing computer vision and image analysis algorithms. In recent years, new methods have pushed pulmonary airway segmentation closer to the limit of image resolution. Since the EXACT'09 pulmonary airway segmentation challenge, limited effort has been directed to quantitative comparison of newly emerged algorithms, which are driven by the maturity of deep learning based approaches and the clinical need to resolve finer details of distal airways for early intervention in pulmonary diseases. Thus far, public annotated datasets are extremely limited, hindering the development of data-driven methods and detailed performance evaluation of new algorithms. To provide a benchmark for the medical imaging community, we organized the Multi-site, Multi-domain Airway Tree Modeling (ATM'22) challenge, which was held as an official challenge event during the MICCAI 2022 conference. ATM'22 provides large-scale CT scans with detailed pulmonary airway annotation, including 500 CT scans (300 for training, 50 for validation, and 150 for testing). The dataset was collected from different sites, and it further included a portion of noisy COVID-19 CTs with ground-glass opacity and consolidation. Twenty-three teams participated in the entire phase of the challenge, and the algorithms of the top ten teams are reviewed in this paper. Quantitative and qualitative results revealed that deep learning models embedded with topological continuity enhancement achieved superior performance in general. The ATM'22 challenge follows an open-call design: the training data and the gold standard evaluation are available upon successful registration via its homepage. Comment: 32 pages, 16 figures. Homepage: https://atm22.grand-challenge.org/. Submitte

    Bioaccumulation and Health Risk Assessment of Heavy Metals in the Soil-Rice System in a Typical Seleniferous Area in Central China

    Seleniferous areas are rich in heavy metals; however, the bioaccumulation and health risk of these metals are poorly understood, given that selenium (Se) can inhibit the phytotoxicity and bioavailability of many heavy metals. The present study investigated the bioaccumulation of heavy metals in the soil-rice system in the Enshi seleniferous area of central China. Soils were contaminated by Mo, Cu, As, Sb, Zn, Cd, Tl, and Hg released by the weathering of Se-rich shales. Among these heavy metals, Cd and Mo had the highest bioavailability in soils. The bioavailable fractions of Cd and Mo accounted for 41.84% and 10.75% of the total Cd and Mo in soils, respectively. Correspondingly, much higher bioaccumulation factors (BAFs) of Cd (0.34) and Mo (0.46) were found in rice, compared with those of other heavy metals (Zn 0.16, Cu 0.05, Hg 0.04, and Sb 0.0002). For the first time, to our knowledge, we showed that the uptake of Hg, Cd, and Cu by rice could be inhibited by the presence of Se in the soil. The probable daily intake (PDI) through consumption of local rice was 252 ± 184, 314 ± 301, and 1774 ± 1326 μg/d for Se, Cd, and Mo, and 7.4 ± 1.68 and 0.87 ± 0.35 mg/d for Zn and Cu, respectively. The high hazard quotients (HQs) of Mo (1.97 ± 1.47) and Cd (5.22 ± 5.02) suggested a high risk of Cd and Mo for Enshi residents through consumption of rice. Environ Toxicol Chem 2019;38:1577-1584. (c) 2019 SETA
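The abstract above reports bioaccumulation factors and hazard quotients without stating how they are computed. A minimal sketch using the conventional definitions (BAF as the concentration in the plant tissue divided by the concentration in soil, HQ as the daily intake divided by a tolerable daily intake) is shown here. The function names and the example numbers are hypothetical and are not the study's data or reference doses.

```python
def bioaccumulation_factor(tissue_conc, soil_conc):
    """BAF = concentration in plant tissue / concentration in soil.

    Both concentrations must be in the same unit (e.g. mg/kg).
    """
    if soil_conc <= 0:
        raise ValueError("soil concentration must be positive")
    return tissue_conc / soil_conc

def hazard_quotient(daily_intake, tolerable_daily_intake):
    """HQ = daily intake / tolerable daily intake (same unit, e.g. ug/d).

    An HQ above 1 is conventionally read as a potential health risk.
    """
    if tolerable_daily_intake <= 0:
        raise ValueError("tolerable daily intake must be positive")
    return daily_intake / tolerable_daily_intake

# Hypothetical illustration only: 0.17 mg/kg in grain, 0.50 mg/kg in soil.
baf = bioaccumulation_factor(0.17, 0.50)
# Hypothetical illustration only: intake of 120 ug/d against a limit of 60 ug/d.
hq = hazard_quotient(120.0, 60.0)
```

With these illustrative inputs the intake is twice the tolerable level, which is the kind of situation the abstract describes as high risk.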

    Selenium translocation in the soil-rice system in the Enshi seleniferous area, Central China

    Rice is an important source of selenium (Se) exposure; however, the transformation and translocation of Se in the soil-rice system remain poorly understood. Here, we investigated the speciation of Se in Se-rich soils from Enshi, Central China, and assessed which Se species are bioavailable to rice grown in Enshi. Extremely high Se concentrations (0.85 to 11.46 mg/kg) were observed in the soils. The soil Se fractions, which include water-soluble Se (0.2 to 3.4%), ligand-exchangeable Se (4.5 to 15.0%), organically bound Se (57.8 to 80.0%), and residual Se (6.1 to 32.9%), are largely controlled by soil organic matter (SOM) levels. Decomposition of SOM promotes the transformation of organically bound Se to water-soluble Se and ligand-exchangeable Se, thereby increasing the bioavailability of Se. The bioaccumulation factors (BAFs) of Se decrease in the following order: roots (0.84 ± 0.30) > bran (0.33 ± 0.17) > leaves (0.18 ± 0.09) > polished rice (0.14 ± 0.07) > stems (0.12 ± 0.07) > husks (0.11 ± 0.07). Selenium levels in rice plants are affected by multiple soil Se fractions. Water-soluble, ligand-exchangeable, and organically bound Se fractions are the major sources of Se in rice tissues. (C) 2019 Published by Elsevier B.V.

    Using CT radiomic features based on machine learning models to subtype adrenal adenoma

    Background: Functioning and non-functioning adrenocortical adenoma are two subtypes of benign adrenal adenoma, and their differential diagnosis is crucial. Current diagnostic procedures use an invasive method, adrenal venous sampling, for endocrinologic assessment. Methods: This study proposes establishing an accurate differential model for subtyping adrenal adenoma using computed tomography (CT) radiomic features and machine learning (ML) methods. Dataset 1 (289 patients with adrenal adenoma) was collected to develop the models, and Dataset 2 (54 patients) was utilized for external validation. Cuboids containing the lesion were cropped from the non-contrast, arterial, and venous phase CT images, and 1,967 features were extracted from each cuboid. Ten discriminative features were selected from each phase or the combined phases. Random forest, support vector machine, logistic regression (LR), Gradient Boosting Machine, and eXtreme Gradient Boosting were used to establish prediction models. Results: The highest accuracies were 72.7%, 72.7%, and 76.1% in the arterial, venous, and non-contrast phases, respectively, when using radiomic features alone with the ML classifier of LR. When features from the three CT phases were combined, the accuracy of LR reached 83.0%. After adding clinical information, the area under the receiver operating characteristic curve increased for all the machine learning methods except for LR. In Dataset 2, the accuracy of LR was the highest, reaching 77.8%. Conclusion: The radiomic features of the lesion in three-phase CT images can potentially suggest the functioning or non-functioning nature of adrenal adenoma. The resulting radiomic models can be a non-invasive, low-cost, and rapid method of minimizing unnecessary testing in asymptomatic patients with incidentally discovered adrenal adenoma.

    Accelerating the pace of ecotoxicological assessment using artificial intelligence.

    Species Sensitivity Distribution (SSD) is a key metric for understanding the potential ecotoxicological impacts of chemicals. However, SSDs have been developed for only a handful of chemicals due to the scarcity of experimental toxicity data. Here we present a novel approach to expand the chemical coverage of SSDs using an Artificial Neural Network (ANN). We collected over 2000 experimental toxicity data points, expressed as Lethal Concentration 50 (LC50), for 8 aquatic species and trained an ANN model for each of the 8 species based on molecular structure. The R² values of the resulting ANN models range from 0.54 to 0.75 (median R² = 0.69). We applied the predicted LC50 values to fit SSD curves using a bootstrapping method, generating SSDs for 8424 chemicals in the Tox21 database. The dataset is expected to serve as a screening-level reference SSD database for understanding potential ecotoxicological impacts of chemicals.
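The abstract above says SSD curves were fitted from per-species LC50 values using bootstrapping, without naming the distribution. As a rough sketch, assuming a log-normal SSD (a common choice, but an assumption here) and a percentile bootstrap over species, the 5th-percentile hazardous concentration (HC5) and its interval could be estimated as follows. The function names and the LC50 values are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def hc5(lc50_values):
    """HC5 from a log-normal SSD fitted to per-species LC50 values."""
    logs = [math.log10(v) for v in lc50_values]
    ssd = NormalDist(mean(logs), stdev(logs))
    return 10 ** ssd.inv_cdf(0.05)

def bootstrap_hc5(lc50_values, n_boot=1000, seed=0):
    """95% percentile-bootstrap interval for HC5, resampling species."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(lc50_values, k=len(lc50_values))
        if len(set(sample)) < 2:
            continue  # zero spread: the normal fit is undefined, skip
        estimates.append(hc5(sample))
    estimates.sort()
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[int(0.975 * len(estimates)) - 1]
    return lower, upper

# Hypothetical LC50 values (mg/L) for 8 species, for illustration only.
lc50 = [0.5, 1.2, 3.0, 4.5, 8.0, 15.0, 22.0, 40.0]
point_estimate = hc5(lc50)
interval = bootstrap_hc5(lc50, n_boot=200)
```

With only 8 species per chemical the bootstrap interval is wide, which fits the abstract's framing of the result as a screening-level reference.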

    Deep radiomic model based on the sphere–shell partition for predicting treatment response to chemotherapy in lung cancer

    Background: The prognosis of chemotherapy is important in clinical decision-making for non-small cell lung cancer (NSCLC) patients. Objectives: To develop a model for predicting treatment response to chemotherapy in NSCLC patients from pre-chemotherapy CT images. Materials and Methods: This retrospective multicenter study enrolled 485 patients with NSCLC who received chemotherapy alone as a first-line treatment. Two integrated models were developed using radiomic and deep-learning-based features. First, we partitioned pre-chemotherapy CT images into spheres and shells with different radii around the tumor (0–3, 3–6, 6–9, 9–12, 12–15 mm) containing intratumoral and peritumoral regions. Second, we extracted radiomic and deep-learning-based features from each partition. Third, using radiomic features, five sphere–shell models, one feature fusion model, and one image fusion model were developed. Finally, the model with the best performance was validated in two cohorts. Results: Among the five partitions, the model of 9–12 mm achieved the highest area under the curve (AUC) of 0.87 (95% confidence interval: 0.77–0.94). The AUC was 0.94 (0.85–0.98) for the feature fusion model and 0.91 (0.82–0.97) for the image fusion model. For the model integrating radiomic and deep-learning-based features, the AUC was 0.96 (0.88–0.99) for the feature fusion method and 0.94 (0.85–0.98) for the image fusion method. The best-performing model had an AUC of 0.91 (0.81–0.97) and 0.89 (0.79–0.93) in two validation sets, respectively. Conclusions: This integrated model can predict the response to chemotherapy in NSCLC patients and assist physicians in clinical decision-making
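The abstract above partitions the image into a sphere and concentric shells (0–3, 3–6, 6–9, 9–12, 12–15 mm) around the tumor. How distance is measured is not stated; the sketch below assumes Euclidean distance from a single tumor center point in physical units, which is only one possible reading. The names `EDGES_MM` and `partition_index` are hypothetical.

```python
import math

# Shell boundaries in millimetres, taken from the abstract.
EDGES_MM = [0, 3, 6, 9, 12, 15]

def partition_index(voxel, center, spacing_mm):
    """Return the index (0-4) of the sphere/shell containing a voxel.

    voxel and center are integer (z, y, x) indices; spacing_mm gives the
    voxel size along each axis. Returns None beyond the outermost shell.
    Measuring from a single center point is an assumption of this sketch.
    """
    distance = math.sqrt(
        sum(((v - c) * s) ** 2 for v, c, s in zip(voxel, center, spacing_mm))
    )
    for i in range(len(EDGES_MM) - 1):
        if EDGES_MM[i] <= distance < EDGES_MM[i + 1]:
            return i
    return None
```

Features would then be extracted separately from the voxels assigned to each index, giving one feature set per partition.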

    Regulatory elements of Caenorhabditis elegans ribosomal protein genes

    Background: Ribosomal protein genes (RPGs) are essential, tightly regulated, and highly expressed during embryonic development and cell growth. Even though their protein sequences are strongly conserved, their mechanism of regulation is not conserved across yeast, Drosophila, and vertebrates. A recent investigation of genomic sequences conserved across both nematode species and associated with different gene groups indicated the existence of several elements in the upstream regions of C. elegans RPGs, providing a new insight regarding the regulation of these genes in C. elegans. Results: In this study, we performed an in-depth examination of C. elegans RPG regulation and found nine highly conserved motifs in the upstream regions of C. elegans RPGs using the motif discovery algorithm DME. Four motifs were partially similar to transcription factor binding sites from C. elegans, Drosophila, yeast, and human. One pair of these motifs was found to co-occur in the upstream regions of 250 transcripts including 22 RPGs. The distance between the two motifs displayed a complex frequency pattern that was related to their relative orientation. We tested the impact of three of these motifs on the expression of rpl-2 using a series of reporter gene constructs and showed that all three motifs are necessary to maintain the high natural expression level of this gene. One of the motifs was similar to the binding site of an orthologue of POP-1, and we showed that RNAi knockdown of pop-1 impacts the expression of rpl-2. We further determined the transcription start site of rpl-2 by 5’ RACE and found that the motifs lie 40–90 bases upstream of the start site. We also found evidence that a noncoding RNA, contained within the outron of rpl-2, is co-transcribed with rpl-2 and cleaved during trans-splicing. Conclusions: Our results indicate that C. elegans RPGs are regulated by a complex novel series of regulatory elements that is evolutionarily distinct from those of all other species examined up until now.